LXGB: a machine learning algorithm for estimating the discharge coefficient of pseudo-cosine labyrinth weir

One practical and economical way to increase the efficiency of weirs is to modify the plan geometry so that the weir length increases for a given width. This increases the discharge coefficient ($C_d$) of the weir. In this study, a new weir, referred to as the pseudo-cosine labyrinth weir (PCLW), was introduced. A hybrid machine learning algorithm, LXGB, was introduced to estimate the $C_d$ of the PCLW. LXGB combines the linear population size reduction history-based adaptive differential evolution (LSHADE) algorithm with the extreme gradient boosting (XGB) algorithm. Seven different input scenarios were presented to estimate the discharge coefficient of the PCLW. To train and test the proposed method, 132 data series, including geometric and hydraulic parameters from the PCLW1 and PCLW2 models, were used. The root mean square error (RMSE), relative root mean square error (RRMSE), and Nash–Sutcliffe model efficiency coefficient (NSE) were used to evaluate the proposed approach. The results showed that the input combination of the ratio of the radius to the weir height (R/W), the ratio of the length of the weir to the weir height (L/W), and the ratio of the hydraulic head to the weir height (H/W) provided the best results in estimating the $C_d$ of the PCLW1 and PCLW2 models, with average values of RMSE = 0.009, RRMSE = 0.010, and NSE = 0.977. The improvement compared to SAELM, ANFIS-FFA, GEP, and ANN in terms of $R^2$ is 2.06%, 3.09%, 1.03%, and 5.15%, respectively. In general, intelligent hybrid approaches can be regarded as the most suitable methods for estimating the $C_d$ of PCLWs.

experimental results proved the superiority of the SVM compared with counterpart adaptive neuro-fuzzy inference systems (ANFIS) and artificial neural networks (ANNs). Bilhan et al. 18 estimated the $C_d$ of labyrinth weirs using support vector regression (SVR) and an outlier robust extreme learning machine. The results showed that the machine learning methods estimated the $C_d$ values more accurately. Safarrazavizadeh et al. 19 performed a laboratory investigation of the flow over labyrinth weirs with semicircular and sinusoidal plans. Their observations showed that, unlike in linear weirs, the discharge coefficient of these weirs rises at low hydraulic heads ($H_T/P$ < 0.35) and decreases after reaching its maximum value. Bonakdari et al. 20 investigated the effectiveness of the gene expression programming (GEP) method for estimating $C_d$ and found that GEP provides better results in predicting $C_d$. Shafiei et al. 21 used the ANFIS-firefly algorithm (ANFIS-FFA) method to estimate the $C_d$ of triangular labyrinth weirs and found that the ANFIS-FFA model is more accurate in predicting it. Emami et al. 8 estimated the $C_d$ of W-planform labyrinth weirs using the improved self-adaptive differential evolutionary algorithm and support vector regression (ISaDE-SVR) method and reported that ISaDE-SVR is highly effective for these weirs. Norouzi et al. 22 simulated $C_d$ using a self-adaptive robust learning machine (SAELM) model. The results showed that the SAELM model estimated the $C_d$ with high accuracy. Wang et al. 23 investigated the application of the genetic algorithm (GA), particle swarm optimization (PSO), and the traditional back-propagation neural network (BPNN) in predicting the $C_d$ of triangular labyrinth weirs. The results showed that the GA-BPNN and PSO-BPNN methods predict $C_d$ with high efficiency. Chen et al. 24 used SVM, random forest (RF), linear regression, k-nearest neighbor (KNN), and decision tree (DT) models to predict the $C_d$ of streamlined weirs. Ahmad et al. 25 used an ANN model to predict the $C_d$ of an arced labyrinth side weir. The results indicated that the $C_d$ calculated by the ANN is more accurate. Emami et al. 26 used the Walnut algorithm and the SVR method to predict the $C_d$ of triangular labyrinth weirs. Safari et al. 27 evaluated ANN, GEP, and regression models for estimating the $C_d$ of the broad-crested weir. The results showed that the ANN estimates the $C_d$ better than the GEP and regression models.
Although previous studies have examined many weir geometries, the $C_d$ of the PCLW has not been investigated. Therefore, in the present study, the $C_d$ of the PCLW was estimated using a hybrid of the LSHADE differential evolution algorithm and the extreme gradient boosting (XGB) approach. The proposed approach was tested with different combinations of features to identify the best-performing combination.
The contributions of this paper are as follows: (a) introducing the LXGB algorithm, which integrates LSHADE with XGB to tune the XGB parameters and further enhance its estimation performance; (b) using the LXGB algorithm to estimate the $C_d$ of the PCLW; (c) evaluating the proposed model on a real-world dataset and comparing it with state-of-the-art algorithms.
The experimental results show the superiority of the proposed method compared with counterparts in terms of performance measures.
The remaining sections of this study are organized as follows. Section "Material and methods" describes the experimental materials and the presented hybrid approach. Section "Results and discussion" presents the results and their discussion. Section "Conclusion" summarizes the paper and gives recommendations for future work.

Material and methods
Dimensional analysis. The one-dimensional equation of the flow over the PCLW is as follows 28 :

$$Q = \frac{2}{3} C_d L \sqrt{2g}\, H_T^{1.5} \quad (1)$$

where $Q$ is the discharge, $g$ is the acceleration of gravity, $L$ is the length of the weir, and $H_T$ is the total hydraulic head ($h + V^2/2g$). The $C_d$ of labyrinth weirs in free flow conditions depends on geometric and hydraulic parameters as follows:

$$C_d = f(L, B, H_T, H_d, V, W, R, S, t, \alpha, N, g, \rho, \mu, \sigma, CS, JS, SW) \quad (2)$$

where $B$ is the channel width, $H_d$ is the total hydraulic head downstream of the weir, $V$ is the flow velocity, $W$ is the height of the weir, $R$ is the radius of weir curvature, $S$ is the length of the straight part between the curves of the weir, $t$ is the thickness of the weir, $\alpha$ is the angle between the straight section joining the weir curves and the direction of the channel, $N$ is the number of cycles, $\rho$ is the fluid density, $\mu$ is the dynamic viscosity, $\sigma$ is the surface tension, CS is the shape of the weir crest, JS is the shape of the overflowing jet (nappe), and SW represents the approach flow and sidewall effects.
Equation (2) can be written in dimensionless form as Eq. (3), where Re is the Reynolds number, We is the Weber number, and Fr is the Froude number. Henderson 29 concluded that if Re > 2000, the effect of viscosity can be neglected. Novak et al. 30 concluded that if the water height over the weir is more than 3 to 4 cm, the effect of surface tension can be ignored. Because the flow was turbulent and the minimum water height over the weir was 5 cm, the effects of Re and We were neglected. All weirs used were sharp-crested, so the effect of CS was ignored. Because the weirs were installed perpendicular to the main flow and there was no local contraction at their installation location, the approach flow and sidewall conditions (SW) were considered the same for all experiments. With these simplifications, Eq. (3) reduces to Eq. (4).

Experimental models. The simulation of the flow around the PCLW was carried out in a channel with a width of 0.49 m to 1.115 m, a length of 3.2 m, and a height of 0.5 m. Figure 1 shows the PCLW models and their geometric features. The geometric features and the range of experimental parameters of the PCLW are presented in Table 1.
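The head-discharge relation from the dimensional analysis above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard sharp-crested form with the 2/3 factor; the function names and the numerical values are hypothetical and are not taken from the experiments.

```python
import math

G = 9.81  # acceleration of gravity in m/s^2


def total_head(h, v, g=G):
    """Total hydraulic head H_T = h + V^2 / (2 g)."""
    return h + v ** 2 / (2.0 * g)


def discharge(cd, length, h_t, g=G):
    """Q = (2/3) * Cd * L * sqrt(2 g) * H_T^1.5 (assumed standard form)."""
    return (2.0 / 3.0) * cd * length * math.sqrt(2.0 * g) * h_t ** 1.5


def discharge_coefficient(q, length, h_t, g=G):
    """Invert the weir equation to recover Cd from a measured discharge."""
    return q / ((2.0 / 3.0) * length * math.sqrt(2.0 * g) * h_t ** 1.5)


# Hypothetical values, chosen only to show the round trip Cd -> Q -> Cd.
h_t = total_head(h=0.05, v=0.2)
q = discharge(cd=0.7, length=1.5, h_t=h_t)
cd_back = discharge_coefficient(q, length=1.5, h_t=h_t)
```

In a laboratory setting, `discharge_coefficient` is the direction of use: Q, L, and the head are measured, and the coefficient is the unknown.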
Extreme gradient boosting (XGB). XGB 31-33 is a robust supervised learning method that solves regression, classification, and ranking problems quickly and accurately. It is a more generalized form of gradient-boosted decision trees: it uses parallel processing, handles missing values efficiently, helps prevent overfitting, and performs well on datasets of different sizes.

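For a given dataset $D = \{(x_i, y_i)\}$ with $n$ examples and $m$ features ($|D| = n$, $x_i \in \mathbb{R}^m$, $y_i \in \mathbb{R}$), XGB consists of an ensemble of $K$ classification and regression trees (CARTs). The final prediction is formulated as follows 31 :

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F$$

where $\hat{y}_i$ is the final predicted value, $F$ is the set of CARTs, and $f_k(x_i)$ is the output of the $k$-th decision tree for the input $x_i$. The objective function of XGB consists of two components, the training error and the regularization, which are defined as follows 31 :

$$Obj = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$

where $\sum_{i=1}^{n} l(y_i, \hat{y}_i)$ is the loss function measuring the difference between the predicted and observed values, and $\sum_{k=1}^{K} \Omega(f_k)$ is the regularization component, with

$$\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^2$$

where $\gamma$ is the leaf penalty coefficient, $T$ is the total number of leaf nodes, $\lambda$ guarantees that the scores of the leaf nodes are not too large, and $w$ is the vector of leaf scores. XGB employs the gradient boosting strategy: it appends one new tree at each iteration and corrects the preceding prediction by fitting the residuals of the previous prediction:

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$

Combining this additive update with the objective function, the objective for the $t$-th tree can be written as 31 :

$$Obj^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t)$$

Taking the Taylor expansion of the loss function up to the second order, this objective can be approximated as follows:

$$Obj^{(t)} \approx \sum_{i=1}^{n} \left[ l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t)$$

where $g_i = \partial_{\hat{y}^{(t-1)}} l(y_i, \hat{y}_i^{(t-1)})$ and $h_i = \partial^2_{\hat{y}^{(t-1)}} l(y_i, \hat{y}_i^{(t-1)})$ are the first- and second-order gradient statistics of the loss function.

The optimal weight $w_j^*$ of leaf $j$ and the corresponding objective value of a tree can be written as follows:

$$w_j^* = -\frac{G_j}{H_j + \lambda}, \qquad Obj^* = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^2}{H_j + \lambda} + \gamma T$$

where $G_j = \sum_{i \in I_j} g_i$, $H_j = \sum_{i \in I_j} h_i$, and $I_j$ is the set of examples assigned to leaf $j$. The weak fitting model is then strengthened as follows:

$$F_t(x_i) = F_{t-1}(x_i) + \eta f_t(x_i)$$

where $\eta$ is the learning rate. XGB appends new trees at each iteration by continuously splitting on features. Appending a new tree to the model amounts to learning a new function $f_k(X, \theta_k)$ that fits the residual of the previous prediction. Once $K$ trees are learned, the strong fitting model is used to predict:

$$F(x_i) = \eta \sum_{k=1}^{K} f_k(x_i, \theta_k)$$

where $F(x_i)$ is the strong fitting model (taking $F_0 = 0$). Figure 2 shows the working principle of XGB.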
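The leaf-weight and tree-objective expressions above can be checked numerically with a small sketch. It assumes a squared-error loss l = 0.5 (y - ŷ)², for which the gradient is ŷ - y and the Hessian is 1; the loss choice, the function names, and the toy numbers are illustrative assumptions, not details from the paper.

```python
def grad_hess_squared_error(y_true, y_pred):
    """First- and second-order statistics for l = 0.5 * (y - y_hat)^2."""
    grads = [p - t for t, p in zip(y_true, y_pred)]
    hess = [1.0 for _ in y_true]
    return grads, hess


def leaf_weight(grads, hess, lam):
    """Optimal leaf weight w* = -G / (H + lambda)."""
    return -sum(grads) / (sum(hess) + lam)


def tree_objective(leaves, lam, gamma):
    """Obj* = -0.5 * sum_j G_j^2 / (H_j + lambda) + gamma * T.

    `leaves` is a list of (grads, hess) pairs, one pair per leaf.
    """
    score = 0.0
    for grads, hess in leaves:
        g_sum, h_sum = sum(grads), sum(hess)
        score += -0.5 * g_sum ** 2 / (h_sum + lam)
    return score + gamma * len(leaves)


# Toy example: three observations, previous prediction of zero everywhere.
y_true = [1.0, 2.0, 3.0]
y_prev = [0.0, 0.0, 0.0]
grads, hess = grad_hess_squared_error(y_true, y_prev)

w_star = leaf_weight(grads, hess, lam=1.0)
obj_one_leaf = tree_objective([(grads, hess)], lam=1.0, gamma=0.1)
```

With λ = 1 the single-leaf weight is pulled below the plain mean of the residuals (2.0), which is the shrinking effect the regularization term is meant to provide.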
Since the hyper-parameters of XGB are often set empirically, careful tuning of these parameters is essential for designing a robust XGB model. In this paper, we used the LSHADE algorithm to tune the XGB parameters, namely the number of decision trees ($K$), learning rate ($\eta$), maximum depth (md), minimum child weight (mcw), gamma value ($\gamma$), and sub-sample ratio (ss). Table 2 lists the XGB parameters and the ranges used in the implementation.
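One way an evolutionary optimizer can work with the six parameters listed above is to search over real vectors in the unit hypercube and decode each vector into a parameter dictionary. The sketch below illustrates that idea only. The bounds are hypothetical placeholders (the actual ranges are those of Table 2), the `decode` helper is an assumption, and the dictionary keys follow the parameter names of the XGBoost scikit-learn interface.

```python
# Hypothetical search bounds; the ranges actually used are listed in Table 2.
BOUNDS = {
    "n_estimators": (50, 500),      # number of decision trees, K
    "learning_rate": (0.01, 0.3),   # eta
    "max_depth": (2, 10),           # md
    "min_child_weight": (1, 10),    # mcw
    "gamma": (0.0, 1.0),            # leaf penalty coefficient
    "subsample": (0.5, 1.0),        # ss
}
# Parameters rounded to integers in this sketch.
INTEGER_PARAMS = {"n_estimators", "max_depth", "min_child_weight"}


def decode(vector):
    """Map a candidate vector in [0, 1]^6 to an XGB parameter dictionary."""
    if len(vector) != len(BOUNDS):
        raise ValueError("expected one value per tunable parameter")
    params = {}
    for u, (name, (low, high)) in zip(vector, BOUNDS.items()):
        u = min(max(u, 0.0), 1.0)  # clip to the unit interval
        value = low + u * (high - low)
        params[name] = int(round(value)) if name in INTEGER_PARAMS else value
    return params


example = decode([0.5] * 6)
```

Keeping the optimizer in the unit hypercube means the same mutation and crossover settings apply to every dimension, regardless of each parameter's natural scale.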

LSHADE.
Success-history-based parameter adaptation for differential evolution (SHADE) 34 is an adaptive evolutionary optimization strategy. LSHADE 35 enhances SHADE with a linear population size reduction technique, which gradually reduces the size of the population using a linear function. LSHADE starts its optimization process with a randomly generated population of real parameter vectors. The algorithm repeats a process of trial vector generation and selection until some termination conditions are satisfied.
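The linear population size reduction described above can be sketched as a single function of the number of fitness evaluations used so far. The schedule below follows the usual LSHADE formulation; the function name, the default minimum population size of 4, and the example budget are assumptions made for illustration.

```python
def lshade_population_size(nfe, max_nfe, n_init, n_min=4):
    """Population size after `nfe` of `max_nfe` fitness evaluations.

    Linear schedule: N = round(n_init + (n_min - n_init) * nfe / max_nfe).
    """
    fraction = min(max(nfe / max_nfe, 0.0), 1.0)
    return int(round(n_init + (n_min - n_init) * fraction))


# Hypothetical budget: 1000 evaluations, starting from 100 individuals.
sizes = [lshade_population_size(nfe, 1000, 100) for nfe in (0, 250, 500, 1000)]
```

After each generation, the worst-ranked individuals are removed until the population matches the scheduled size, so the search shifts from exploration to exploitation as the budget runs out.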

LSHADE-XGB (LXGB).
The motivation behind LXGB is to improve the estimation performance of XGB by integrating the LSHADE optimization algorithm with XGB. Figure 3 shows the working principle of the LXGB algorithm.
The performance measures are the root mean square error (RMSE), the relative root mean square error (RRMSE), and the Nash-Sutcliffe model efficiency coefficient (NSE), where $X_i$ is the predicted value, $Y_i$ is the observed value, and $\bar{X}$ is the average of $X$.
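The three performance measures can be computed as in the sketch below, which uses their common textbook definitions. Normalizing RRMSE by the mean of the observed values, and using the observed mean in NSE, are assumptions; the paper's exact formulas may differ in which series supplies the average.

```python
import math


def rmse(pred, obs):
    """Root mean square error between predicted and observed values."""
    n = len(obs)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / n)


def rrmse(pred, obs):
    """RMSE divided by the mean of the observed values (assumed convention)."""
    mean_obs = sum(obs) / len(obs)
    return rmse(pred, obs) / mean_obs


def nse(pred, obs):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of the observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for p, o in zip(pred, obs))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


# Toy data for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
scores = (rmse(predicted, observed), rrmse(predicted, observed), nse(predicted, observed))
```

An NSE of 1 indicates a perfect match, while values at or below 0 mean the model does no better than predicting the observed mean.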

Results and discussion
Figure 2. A big picture of the XGB method.

The $C_d$ of the PCLW1 and PCLW2 weirs was estimated using the hybrid LXGB approach. First, all available data were normalized to remove or correct outliers 36 :

$$X_n = \frac{X - X_{min}}{X_{max} - X_{min}}$$

where $X_{min}$ is the minimum data, $X$ represents the raw data, $X_{max}$ is the maximum data, and $X_n$ is the normalized data. The ratio of the weir length to the weir height (L/W), the ratio of the channel width to the weir height (B/W), the ratio of the weir thickness to the weir height (t/W), the number of cycles (N), the ratio of the radius to the weir height (R/W), the ratio of the length of the straight section between the weir curves to the weir height (S/W), and the ratio of the hydraulic head to the weir height (H/W) were considered as input parameters of the LXGB approach. A total of 132 data series, including geometric and hydraulic parameters, were selected. The data were randomly divided into two parts: 80% (106 data) for training the model and 20% (26 data) for testing it.
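The preprocessing described above, min-max normalization followed by a random 80/20 split, can be sketched with the standard library alone. The function names and the fixed seed are illustrative assumptions; the paper does not state how the random split was implemented.

```python
import random


def min_max_normalize(values):
    """Scale values to [0, 1] via (X - X_min) / (X_max - X_min)."""
    low, high = min(values), max(values)
    if high == low:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def train_test_split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle the sample indices and split them into train and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split_indices(132)
normalized = min_max_normalize([2.0, 4.0, 6.0])
```

For 132 samples this yields 106 training and 26 test indices, matching the counts reported in the text.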
Seven models with different input variables were examined to identify the most influential input parameters in estimating the $C_d$ of the PCLW1 and PCLW2 weirs. Tables 3 and 4 and Figs. 4 and 5 present the various input variables.
In Tables 5 and 6, the evaluation criteria for the different input variables used to estimate the $C_d$ are presented. A part of the modeling process by the LXGB approach is presented in Fig. 6.
The results show the accuracy of the presented LXGB approach in estimating the $C_d$ of the PCLW1 and PCLW2 models. Mahmoud et al. 37 concluded that the ANFIS-PSO and MLP-FA (multi-layer perceptron and firefly optimization algorithm) methods are the most accurate in estimating the $C_d$ of triangular labyrinth weirs. In a similar study, Majediasl and Fuladipanah 38 concluded that the SVM model produces the most accurate results in predicting the $C_d$ of the labyrinth weir, with RMSE = 0.0118. Shafiei et al. 21 reported that the ANFIS-FFA model is quite accurate in estimating the $C_d$ of the labyrinth weir. Karami et al. 10 showed that the ELM method, with RMSE = 0.006, has acceptable efficiency in estimating the $C_d$ of the labyrinth weir. In a similar study, the least-squares support vector machine-bat algorithm (LSSVM-BA) method was used to investigate the discharge of a curved labyrinth weir 39 .

The estimated and observed $C_d$ values of the PCLW1 and PCLW2 models are compared in Figs. 7 and 8. According to the results, the $K_6$ model, with the input variables (R/W), (L/W), and (H/W), had the optimal values of the statistical indicators. The $C_d$ of the PCLW1 and PCLW2 weirs increases with increasing weir height. A similar study concluded that the $C_d$ of the triangular duckbill labyrinth weir increases with the weir height, which agrees with the results of the present study 7 . The increase in the effective length of the labyrinth at a specified width, caused by the increase in the radius of the PCLW1 and PCLW2 weirs, increases the $C_d$. Previous studies showed that increasing the radius causes a reduction in eddy flows, turbulence, and sudden increases in water height along the weir 39,40,42 .
Previous investigations showed that the $C_d$ of the arched labyrinth weir increases with increasing R/W, which is consistent with the results of the present study 41 . The $K_2$ model (H/W, L/W, R/W, N) ranks second, which shows that the length, weir height, radius, and number of cycles have a more significant impact on the $C_d$ of the PCLW1 and PCLW2 weirs. Increasing the number of labyrinth weir cycles increases the discharge and the $C_d$, which is consistent with the results of the present study 40,43 . Figure 9 shows the importance of the influential input parameters in estimating the $C_d$ of the PCLW. Emami et al. 44 predicted the $C_d$ of curved-plan labyrinth weirs using the WOA-ANFIS method and identified the input parameters H/W and θ (weir arc angle) as the most effective in estimating the $C_d$. Majediasl and Fuladipanah 38 investigated the support vector machine (SVM) method for the $C_d$ of sharp-crested triangular labyrinth weirs and concluded that the input combination of geometric parameters (θ, h/w, L/B) gives the best results. Mohammadi et al. 45 reported that the parameters $H_t/P$, W/P (the ratio of the weir width to the height), R/W, and W/LC (the ratio of the weir width to the effective length) give the highest accuracy and efficiency as input variables for estimating the $C_d$ of U-shaped labyrinth weirs. Haghiabi et al. 46 estimated the $C_d$ of triangular labyrinth weirs using the ANFIS system and concluded that ANFIS performs well in $C_d$ estimation. Studies showed that H/W is the most influential parameter on the $C_d$ of labyrinth and arced labyrinth weirs 47 . Table 7 compares the performance of XGB and LXGB on the test dataset. The results show the superiority of LXGB over the XGB algorithm in terms of the performance measures. This indicates that combining LSHADE with XGB improves the estimation performance.

Data availability
The datasets generated and/or analyzed during the current study are not publicly available but are available from the corresponding author on reasonable request.